Appendix A

Computation of Principal Components 



This Appendix is the only part of the book that has shrunk [...]
and performing related analyses, which were then available in five of
[...]
quotations from the first edition.

Despite the likelihood that personal computers will become the
main tool for . . . users of PCA . . . [it] is still usually carried out
on mainframe computers . . . [T]he author has no experience yet
of PCA on personal computers.

MINITAB does not have any direct instructions for finding PCs. 
Five packages were described in the first edition: BMDP, GENSTAT,
MINITAB, SAS and SPSS-X. Since then a number of new packages or
languages have appeared. Perhaps the biggest change is the greatly
expanded use by statisticians of S-PLUS and its 'open source' relative R.
The MATLAB software should also be mentioned. Although it is not
primarily a statistical package, it has found increasing favour among
statisticians as a programming environment within which new statistical
techniques can be implemented. PCA is also included in some neural
network software.

All the main statistical software packages incorporate procedures for find- 
ing the basic results of a PCA. There are some variations in output, such as 
the choice of normalization constraints used for the vectors of loadings or 
coefficients. This can cause confusion for the unwary user (see Section 11.1), 
who may also be confused by the way in which some software erroneously 
treats PCA as a special case of factor analysis (see Chapter 7). However, 
even with this misleading approach, numerically correct answers are
produced by all the major statistical software packages, and provided that the
user is careful to ensure that he or she understands the details of how the 
output is presented, it is usually adequate to use whichever software is most 
readily available. 

Most statistical packages produce the basic results of a PCA satisfacto- 
rily, but few provide much in the way of extensions or add-ons as standard 
features. Some allow (or even encourage) rotation, though not necessarily 
in a sufficiently flexible manner to be useful, and some will display biplots. 
With most it is fairly straightforward to use the output from PCA in an- 
other part of the package so that PC regression, or discriminant or cluster 
analysis using PCs instead of the measured variables (see Chapters 8 and 
9) are easily done. Beyond that, there are two possibilities for many of the
extensions to PCA. Either software is available from the originator of the
technique, or extra functions or code can be added to the more flexible
software, such as S-PLUS or R. 

A.1 Numerical Calculation of Principal Components

Most users of PCA, whether statisticians or non-statisticians, have little 
desire to know about efficient algorithms for computing PCs. Typically, a 
statistical program or package can be accessed that performs the analysis 
automatically. Thus, the user does not need to write his or her own pro- 
grams; often the user has little or no interest in whether or not the software 
available performs its analyses efficiently. As long as the results emerge, the 
user is satisfied. 

However, the type of algorithm used can be important, in particular if 
some of the last few PCs are of interest or if the data set is very large. 
Many programs for PCA are geared to looking mainly at the first few PCs, 
especially if PCA is included only as part of a factor analysis routine. In 
this case, several algorithms can be used successfully, although some will 
encounter problems if any pairs of the eigenvalues are very close together. 
When the last few or all of the PCs are to be calculated, difficulties are more 
likely to arise for some algorithms, particularly if some of the eigenvalues 
are very small. 



The Power Method 

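A form of the power method was described by Hotelling (1933) in his
original paper on PCA, and an accelerated version of the technique was
presented in Hotelling (1936). In its simplest form, the power method is a
technique for finding the largest eigenvalue and the corresponding
eigenvector of a $(p \times p)$ matrix $T$. The idea is to choose an initial
$p$-element vector $u_0$, and then form the sequence

$$u_1 = T u_0$$
$$u_2 = T u_1 = T^2 u_0$$
$$\vdots$$
$$u_r = T u_{r-1} = T^r u_0$$

If $\alpha_1, \alpha_2, \ldots, \alpha_p$ are the eigenvectors of $T$, then they form a basis for
$p$-dimensional space, and we can write, for arbitrary $u_0$,

$$u_0 = \sum_{k=1}^{p} \kappa_k \alpha_k$$

for some set of constants $\kappa_1, \kappa_2, \ldots, \kappa_p$. Then

$$u_1 = T u_0 = \sum_{k=1}^{p} \kappa_k T \alpha_k = \sum_{k=1}^{p} \kappa_k \lambda_k \alpha_k,$$

where $\lambda_1, \lambda_2, \ldots, \lambda_p$ are the eigenvalues of $T$. Continuing, we get for
$r = 2, 3, \ldots$

$$u_r = \sum_{k=1}^{p} \kappa_k \lambda_k^r \alpha_k$$

and

$$\frac{u_r}{\kappa_1 \lambda_1^r} = \alpha_1 + \frac{\kappa_2}{\kappa_1}\left(\frac{\lambda_2}{\lambda_1}\right)^r \alpha_2 + \cdots + \frac{\kappa_p}{\kappa_1}\left(\frac{\lambda_p}{\lambda_1}\right)^r \alpha_p.$$

Assuming that the first eigenvalue of $T$ is distinct from the remaining
eigenvalues, so that $\lambda_1 > \lambda_2 \geq \cdots \geq \lambda_p$, it follows that a suitably
normalized version of $u_r \to \alpha_1$ as $r \to \infty$. It also follows that the ratios of
corresponding elements of $u_r$ and $u_{r-1}$ tend to $\lambda_1$ as $r \to \infty$.

The power method thus gives a simple algorithm for finding the first
(largest) eigenvalue of a covariance or correlation matrix and its
corresponding eigenvector, from which the first PC and its variance can be
derived. It works well if $\lambda_1 \gg \lambda_2$, but converges only slowly if $\lambda_1$ is not well
separated from $\lambda_2$. Speed of convergence also depends on the choice of the
initial vector $u_0$; convergence is most rapid if $u_0$ is close to $\alpha_1$.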
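
The basic iteration can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the function name `power_method`, the example matrix `S`, the starting vector of ones and the stopping rule are all hypothetical, and the eigenvalue is estimated by the Rayleigh quotient u'Tu rather than by ratios of corresponding elements of successive vectors.

```python
import numpy as np

def power_method(T, u0=None, tol=1e-12, max_iter=10_000):
    """Approximate the largest eigenvalue and unit eigenvector of symmetric T."""
    T = np.asarray(T, dtype=float)
    p = T.shape[0]
    u = np.ones(p) if u0 is None else np.asarray(u0, dtype=float)
    u = u / np.linalg.norm(u)
    lam = 0.0
    for _ in range(max_iter):
        v = T @ u                      # one step of u_r = T u_{r-1}
        lam_new = u @ v                # Rayleigh quotient for the current unit vector
        u = v / np.linalg.norm(v)      # renormalize so the iterates stay bounded
        converged = abs(lam_new - lam) <= tol * max(1.0, abs(lam_new))
        lam = lam_new
        if converged:
            break
    return lam, u

# Hypothetical 3 x 3 covariance matrix used only for illustration.
S = np.array([[4.0, 2.0, 0.6],
              [2.0, 3.0, 0.4],
              [0.6, 0.4, 1.0]])
lam1, a1 = power_method(S)   # variance and coefficients of the first PC
```

Renormalizing at every step changes only the length of the iterate, not its direction, so the limit is the same eigenvector as in the unnormalized sequence.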

If $\lambda_1 = \lambda_2 > \lambda_3$, a similar argument to that given above shows that a
suitably normalized version of $u_r \to \alpha_1 + (\kappa_2/\kappa_1)\alpha_2$ as $r \to \infty$. Thus,
the method does not lead to $\alpha_1$, but it still provides information about
the space spanned by $\alpha_1$, $\alpha_2$. Exact equality of eigenvalues is extremely
unlikely for sample covariance or correlation matrices, so we need not worry
too much about this case.

Rather than looking at all $u_r$, $r = 1, 2, 3, \ldots$, attention can be restricted
to $u_1, u_2, u_4, u_8, \ldots$ (that is $T u_0, T^2 u_0, T^4 u_0, T^8 u_0, \ldots$) by simply squaring
each successive power of T. This accelerated version of the power method 
was suggested by Hotelling (1936). The power method can be adapted to 
find the second, third, ... PCs, or the last few PCs (see Morrison, 1976,
p. 281), but it is likely to encounter convergence problems if eigenvalues 
are close together, and accuracy diminishes if several PCs are found by the 
method. Simple worked examples for the first and later components can be 
found in Hotelling (1936) and Morrison (1976, Section 8.4).
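
The squaring idea can be illustrated as follows. This is a minimal sketch, not Hotelling's own procedure: the function name `power_method_squaring`, the number of squarings and the rescaling of each power by its Frobenius norm (to avoid overflow) are assumptions made for the example.

```python
import numpy as np

def power_method_squaring(T, u0, n_squarings=6):
    """Approximate the leading eigenvector from T^2, T^4, T^8, ... applied to u0."""
    M = np.asarray(T, dtype=float)
    for _ in range(n_squarings):
        M = M @ M                     # T^(2^k) becomes T^(2^(k+1))
        M = M / np.linalg.norm(M)     # rescaling leaves the eigenvectors unchanged
    u = M @ np.asarray(u0, dtype=float)
    return u / np.linalg.norm(u)

# Hypothetical covariance matrix used only for illustration.
S = np.array([[4.0, 2.0, 0.6],
              [2.0, 3.0, 0.4],
              [0.6, 0.4, 1.0]])
a1 = power_method_squaring(S, np.ones(3))
lam1 = a1 @ S @ a1                    # Rayleigh quotient gives the eigenvalue
```

Six squarings correspond to the 64th power of the matrix, at the cost of six matrix products rather than 64 matrix-vector products; whether that is a saving depends on the size of the matrix.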

There are various adaptations to the power method that partially over- 
come some of the problems just mentioned. A large number of such 
adaptations are discussed by Wilkinson (1965, Chapter 9), although some
are not directly relevant to positive-semidefinite matrices such as
covariance or correlation matrices. Two ideas that are of use for such matrices
will be mentioned here. First, the origin can be shifted, that is the matrix
$T$ is replaced by $T - \rho I_p$, where $I_p$ is the identity matrix, and $\rho$ is chosen
to make the ratio of the first two eigenvalues of $T - \rho I_p$ much larger than
the corresponding ratio for $T$, hence speeding up convergence.

A second modification is to use inverse iteration (with shifts), in which 
case the iterations of the power method are used but with $(T - \rho I_p)^{-1}$
replacing $T$. This modification has the advantage over the basic power
method with shifts that, by using appropriate choices of $\rho$ (different for
different eigenvectors), convergence to any of the eigenvectors of $T$ can
be achieved. (For the basic method it is only possible to converge in the
first instance to $\alpha_1$ or to $\alpha_p$.) Furthermore, it is not necessary to explicitly
calculate the inverse of $T - \rho I_p$, because the equation $u_r = (T - \rho I_p)^{-1} u_{r-1}$
can be replaced by $(T - \rho I_p) u_r = u_{r-1}$. The latter equation can then
be solved using an efficient method for the solution of systems of linear 
equations (see Wilkinson, 1965, Chapter 4). Overall, computational savings 
with inverse iteration can be large compared to the basic power method 
(with or without shifts), especially for matrices with special structure, such 
as tridiagonal matrices. It turns out that an efficient way of computing 
PCs is to first transform the covariance or correlation matrix to tridiagonal 
form using, for example, either the Givens or Householder transformations 
(Wilkinson, 1965, pp. 282, 290), and then to implement inverse iteration 
with shifts on this tridiagonal form. 
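
A minimal sketch of inverse iteration with a fixed shift is given below. It works on the full matrix with a general-purpose linear solver, so it does not include the tridiagonalization step; the function name `inverse_iteration`, the example matrix `S` and the shift value 1.5 are assumptions chosen for illustration.

```python
import numpy as np

def inverse_iteration(T, rho, tol=1e-12, max_iter=500):
    """Approximate the eigenpair of symmetric T whose eigenvalue is nearest rho."""
    T = np.asarray(T, dtype=float)
    p = T.shape[0]
    A = T - rho * np.eye(p)               # shifted matrix T - rho * I_p
    u = np.ones(p) / np.sqrt(p)
    lam = rho
    for _ in range(max_iter):
        v = np.linalg.solve(A, u)         # solve (T - rho I) u_r = u_{r-1}; no explicit inverse
        u = v / np.linalg.norm(v)
        lam_new = u @ T @ u               # Rayleigh quotient estimate of the eigenvalue
        converged = abs(lam_new - lam) <= tol * max(1.0, abs(lam_new))
        lam = lam_new
        if converged:
            break
    return lam, u

# Hypothetical covariance matrix; the shift 1.5 is an arbitrary rough guess.
S = np.array([[4.0, 2.0, 0.6],
              [2.0, 3.0, 0.4],
              [0.6, 0.4, 1.0]])
lam2, a2 = inverse_iteration(S, rho=1.5)
```

The iteration converges to whichever eigenvector has its eigenvalue closest to the shift, which is why different shifts give access to different PCs.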
There is one problem with shifting the origin that has not yet been
mentioned. This is the fact that to choose efficiently the values of $\rho$ that
determine the shifts, we need some preliminary idea of the eigenvalues
of $T$. This preliminary estimation can be achieved by using the method
of bisection, which in turn is based on the Sturm sequence property of
tridiagonal matrices. Details will not be given here (see Wilkinson, 1965),
but the method provides a quick way of finding approximate values
of the eigenvalues of a tridiagonal matrix. In fact, bisection could be
used to find the eigenvalues to any required degree of accuracy, and inverse
iteration implemented solely to find the eigenvectors.
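
Since the details are not given in the text, the following is only a sketch of one standard way to carry out bisection on a symmetric tridiagonal matrix. It assumes the matrix is supplied as its diagonal `d` and off-diagonal `e`, counts the eigenvalues below a trial value from the signs of a Sturm-type sequence, and brackets all eigenvalues with Gershgorin bounds. The function names and the example values are hypothetical.

```python
import numpy as np

def count_below(d, e, x):
    """Number of eigenvalues of the symmetric tridiagonal matrix (d, e) that are < x."""
    count = 0
    q = 1.0
    for i in range(len(d)):
        off = e[i - 1] ** 2 if i > 0 else 0.0
        q = d[i] - x - off / q           # next term of the Sturm-type sequence
        if q == 0.0:
            q = -1e-300                  # nudge off zero to avoid division by zero
        if q < 0.0:
            count += 1
    return count

def kth_eigenvalue(d, e, k, tol=1e-12, max_iter=200):
    """k-th smallest eigenvalue (k = 0, 1, ...) found by bisection."""
    d = np.asarray(d, dtype=float)
    e = np.asarray(e, dtype=float)
    r = np.zeros_like(d)
    r[:-1] += np.abs(e)
    r[1:] += np.abs(e)
    lo, hi = float(np.min(d - r)), float(np.max(d + r))   # Gershgorin bounds
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if count_below(d, e, mid) > k:   # more than k eigenvalues lie below mid
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical tridiagonal matrix: diagonal d, off-diagonal e.
d = [2.0, 3.0, 1.0, 4.0]
e = [1.0, 0.5, 0.8]
vals = [kth_eigenvalue(d, e, k) for k in range(len(d))]
```

Stopping the bisection early gives the rough eigenvalue estimates needed to choose shifts; running it to a tight tolerance gives the eigenvalues themselves.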

Two major collections of subroutines for finding eigenvalues and
eigenvectors for a wide variety of classes of matrix are the EISPACK package
(Smith et al., 1976), which is distributed by IMSL, and parts of the NAG
library of subroutines. In both of these collections, there are
recommendations as to which subroutines are most appropriate for various types of
eigenproblem. In the case where only a few of the eigenvalues and
eigenvectors of a real symmetric matrix are required (corresponding to finding just
a few of the PCs for a covariance or correlation matrix) both EISPACK
and NAG recommend transforming to tridiagonal form using Householder
transformations, and then finding eigenvalues and eigenvectors using
bisection and inverse iteration respectively. NAG and EISPACK both base their
subroutines on algorithms published in Wilkinson and Reinsch (1971), as
do the 'Numerical Recipes' for eigensystems given by Press et al. (1992,
Chapter 11).



The QL Algorithm 

If all of the PCs are required, then methods other than those just described
may be more efficient. For example, both EISPACK and NAG recommend
that we should still transform the covariance or correlation matrix to
tridiagonal form, but at the second stage the so-called QL algorithm should
be used instead of bisection and inverse iteration. Chapter 8 of Wilkinson
(1965) spends over 80 pages describing the QR and LR algorithms (which
are closely related to the QL algorithm), but only a very brief outline will
be given here.

The basic idea behind the QL algorithm is that any non-singular matrix
$T$ can be written as $T = QL$, where $Q$ is orthogonal and $L$ is lower
triangular. (The QR algorithm is similar, except that $T$ is written instead
as $T = QR$, where $R$ is upper triangular, rather than lower triangular.) If
$T_1 = T$ and we write $T_1 = Q_1 L_1$, then $T_2$ is defined as $T_2 = L_1 Q_1$. This
is the first step in an iterative procedure. At the next step $T_2$ is written
as $T_2 = Q_2 L_2$ and $T_3$ is defined as $T_3 = L_2 Q_2$. In general, $T_r$ is written
as $Q_r L_r$ and $T_{r+1}$ is then defined as $L_r Q_r$, $r = 1, 2, 3, \ldots$, where $Q_1$, $Q_2$,
$Q_3, \ldots$ are orthogonal matrices, and $L_1, L_2, L_3, \ldots$ are lower triangular. It
can be shown that $T_r$ converges to a diagonal matrix, with the eigenvalues
of $T$ in decreasing absolute size down the diagonal. Eigenvectors can be
found by accumulating the transformations in the QL algorithm (Smith et
al., 1976, p. 468).
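
The unshifted iteration can be sketched as follows. NumPy provides a QR factorization but no QL routine, so this example obtains a QL factorization by applying QR to the matrix with its rows and columns reversed; that device, the function names, the fixed iteration count and the example matrix are all assumptions of the sketch. It omits the tridiagonalization and shifts that production routines use.

```python
import numpy as np

def ql_decompose(T):
    """Factor T = Q L (Q orthogonal, L lower triangular) via QR of the reversed matrix."""
    Q_rev, R_rev = np.linalg.qr(T[::-1, ::-1])
    return Q_rev[::-1, ::-1], R_rev[::-1, ::-1]

def ql_algorithm(T, n_iter=200):
    """Unshifted QL iteration; returns the diagonal of T_r and the accumulated Q's."""
    Tr = np.array(T, dtype=float)
    V = np.eye(Tr.shape[0])
    for _ in range(n_iter):
        Q, L = ql_decompose(Tr)
        Tr = L @ Q        # T_{r+1} = L_r Q_r, which equals Q_r' T_r Q_r
        V = V @ Q         # accumulate the orthogonal transformations
    return np.diag(Tr).copy(), V

# Hypothetical covariance matrix used only for illustration.
S = np.array([[4.0, 2.0, 0.6],
              [2.0, 3.0, 0.4],
              [0.6, 0.4, 1.0]])
Q, L = ql_decompose(S)
eigvals, V = ql_algorithm(S)   # columns of V approximate the eigenvectors
```

The position in which each eigenvalue ends up on the diagonal depends on the variant of the algorithm and on any shifts, so it is safest to sort the values (and reorder the columns of `V` to match) before using them.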

As with the power method, the speed of convergence of the QL algorithm 
depends on the ratios of consecutive eigenvalues. The idea of incorporating 
shifts can again be implemented to improve the algorithm and, unlike the 
power method, efficient strategies exist for finding appropriate shifts that 
do not rely on prior information about the eigenvalues (see, for example, 
Lawson and Hanson (1974, p. 109)). The QL algorithm can also cope with 
equality between eigenvalues.

It is probably fair to say that the algorithms described in detail by
Wilkinson (1965) and Wilkinson and Reinsch (1971), and implemented in
various IMSL and NAG routines, have stood the test of time. They still
provide efficient ways of computing PCs in many circumstances. However,
we conclude the Appendix by discussing two alternatives. The first is
implementation via the singular value decomposition (SVD) of the data matrix,
and the second consists of the various algorithms for PCA that have been
suggested in the neural networks literature. The latter is a large topic and
will be summarized only briefly.

One other type of algorithm that has been used recently to find PCs is 
the EM algorithm (Dempster et al., 1977). This is advocated by Tipping
and Bishop (1999a,b) and Roweis (1997), and has its greatest value in cases
where some of the data are missing (see Section 13.6).



Singular Value Decomposition 

The suggestion that PCs may best be computed using the SVD of the 
data matrix (see Section 3.5) is not new. For example, Chambers (1977, p.
111) talks about the SVD providing the best approach to computation of
principal components and Gnanadesikan (1977, p. 10) states that '...the 
recommended algorithm for . . . obtaining the principal components is either 
the . . . QR method ... or the singular value decomposition.' In constructing 
the SVD, it turns out that similar algorithms to those given above can 
be used. Lawson and Hanson (1974, p. 110) describe an algorithm (see 
also Wilkinson and Reinsch (1971)) for finding the SVD, which has two 
stages; the first uses Householder transformations to transform to an upper 
bidiagonal matrix, and the second applies an adapted QR algorithm. The 
method is therefore not radically different from that described earlier. 
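
As an illustration of computing PCs from the SVD of the data matrix, the sketch below uses NumPy's general SVD routine on simulated data. The simulated data, the variable names and the divisor n - 1 for the variances are assumptions of the example.

```python
import numpy as np

# Simulated data, for illustration only: n = 50 observations on p = 4 variables.
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))
X = X_raw - X_raw.mean(axis=0)          # column-centred data matrix
n = X.shape[0]

U, s, Vt = np.linalg.svd(X, full_matrices=False)

coefficients = Vt.T                     # columns hold the PC coefficient vectors
variances = s**2 / (n - 1)              # PC variances (eigenvalues of the covariance matrix)
scores = U * s                          # PC scores, identical to X @ coefficients
```

No covariance matrix is formed at any point; the coefficients, variances and scores all come directly from the factors of the centred data matrix.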

As noted at the end of Section 8.1, the SVD can also be useful in com- 
putations for regression (Mandel, 1982; Nelder, 1985), so the SVD has 
further advantages if PCA is used in conjunction with regression. Nash 
and Lefkovitch (1976) describe an algorithm that uses the SVD to provide 
a variety of results for regression, as well as PCs. 

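Another point concerning the SVD is that it provides simultaneously not
only the coefficients and variances for the PCs, but also the scores of each
observation on each PC, and hence all the information that is required to
construct a biplot (see Section 5.3). The PC scores would otherwise need to
be derived as an extra step after calculating the eigenvalues and
eigenvectors of the covariance or correlation matrix.

The values of the PC scores are related to the eigenvectors of $XX'$, which
can be derived from the eigenvectors of $X'X$ (see the proof of Property G4
in Section 3.2); conversely, the eigenvectors of $X'X$ can be found from
those of $XX'$. In circumstances where the sample size $n$ is smaller than the
number of variables $p$, $XX'$ has smaller dimensions than $X'X$, so it
can be advantageous to use the algorithms described above, based on the
power method or QL method, on a multiple of $XX'$ rather than $X'X$ in such
cases. Large computational savings are possible when $n \ll p$, as in
spectroscopy or in the genetic example of Hastie et al. (2000), which is
described in Section 9.2 and which has $n = 48$, $p = 4673$. Algorithms also
exist for updating the SVD if data arrive sequentially (see for example
Berry et al. (1995)).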
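
The relationship between the eigenvectors of XX' and those of X'X can be sketched as follows for a case with far fewer observations than variables. The simulated data, the dimensions and the variable names are assumptions; a library eigenvalue routine stands in for the power or QL method.

```python
import numpy as np

# Simulated data with n much smaller than p, for illustration only.
rng = np.random.default_rng(1)
n, p = 10, 200
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)                   # column-centred data matrix

G = X @ X.T                              # (n x n) matrix XX' instead of (p x p) X'X
w, U = np.linalg.eigh(G)
order = np.argsort(w)[::-1]              # sort eigenvalues into decreasing order
w, U = w[order], U[:, order]

k = n - 1                                # centring leaves at most n - 1 non-zero eigenvalues
A = X.T @ U[:, :k] / np.sqrt(w[:k])      # unit-length eigenvectors of X'X (PC coefficients)
variances = w[:k] / (n - 1)              # PC variances
scores = U[:, :k] * np.sqrt(w[:k])       # PC scores, equal to X @ A
```

If u is a unit eigenvector of XX' with eigenvalue w, then X'u is an eigenvector of X'X with the same eigenvalue and squared length w, which is why the columns are divided by the square root of w.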

Neural Network Algorithms

Neural networks provide ways of extending PCA, including some non-linear
generalizations (see Sections 14.1.3, 14.6.1). They also give alternative
algorithms for estimating 'ordinary' PCs. The main difference between these
algorithms and the techniques described earlier in the Appendix is that
most are 'adaptive' rather than 'batch' methods. If the whole of a data
set is collected before PCA is done and parallel processing is not possible,
then batch methods such as the QR algorithm are hard to beat (see
Diamantaras and Kung, 1996 (hereafter DK96), Sections 3.5.3, 4.4.1). On the
other hand, if data arrive sequentially and PCs are re-estimated when new
data become available, then adaptive neural network algorithms come into
their own. DK96 describe a large number of alternative techniques;
different algorithms arise depending on

• whether the first or last few PCs are of interest;

• whether one or more than one PC is required;

• whether individual PCs are wanted or whether subspaces spanned by
several PCs will suffice;

• whether the network is required to be biologically plausible.

DK96, Section 4.2.7 treat finding the last few PCs as a different technique,
calling it minor component analysis.

In their Section 4.4, DK96 compare the properties, including speed, of
seven algorithms using simulated data. In a later section they discuss
multi-layer networks.
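
To give a flavour of an adaptive method, the sketch below uses Oja's rule, a well-known single-unit updating rule. The text does not single out this rule, so its use here is an assumption, as are the function name `oja_first_pc`, the decreasing step size 1/(t0 + t), the simulated data stream and the covariance matrix `S`.

```python
import numpy as np

def oja_first_pc(stream, p, t0=200.0):
    """Adaptive estimate of the first PC direction, updated one observation at a time."""
    u = np.ones(p) / np.sqrt(p)           # arbitrary starting vector
    for t, x in enumerate(stream):
        eta = 1.0 / (t0 + t)              # decreasing step size (an assumed schedule)
        y = u @ x                         # output of the single linear unit
        u = u + eta * y * (x - y * u)     # Oja's update
    return u / np.linalg.norm(u)

# Simulated zero-mean observations arriving sequentially; S is hypothetical.
rng = np.random.default_rng(0)
S = np.array([[4.0, 2.0, 0.6],
              [2.0, 3.0, 0.4],
              [0.6, 0.4, 1.0]])
data = rng.multivariate_normal(np.zeros(3), S, size=20_000)
u_hat = oja_first_pc(data, p=3)
```

Each observation is used once and then discarded, so no covariance matrix is stored; the price is that the estimate is noisy and only approaches the batch solution as more data arrive.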
Neural network algorithms are feasible for larger data sets than batch
methods because they are better able to take advantage of developments
in computer architecture. DK96, Chapter 8, discuss the potential for
exploiting parallel VLSI (very large scale integration) systems, where the
most appropriate algorithms may be different from those for non-parallel
systems (DK96, Section 3.5.5). They discuss both digital and analogue
implementations and their pros and cons (DK96, Section 8.3). Classical
eigenvector-based algorithms are not easily parallelizable, whereas neural
network algorithms are (DK96 pp. 205-207).
102, 104-107, 338, 341-343,
353, 372, 373, 375, 385, 386, 
391 

multiple correspondence analysis 
102, 343, 375, 376 
Cramer-von Mises statistic 402 

decomposition into 'PCs' 402 
crime rates 147-149, 300 
cross-correlation asymmetric PCA 

401

cross-validation 112, 120-127, 131, 
132, 175, 177, 185, 187, 239, 
253 
cumulative percentage of total

variation 55, 112-114, 126, 
130-137, 147, 166, 201, 211, 
249 

see also rules for selecting PCs 
curve alignment/registration 316,

323, 324 
cyclic subspace regression 184 
cyclo-stationary EOFs and POPs 

314-316 

DASCO (discriminant analysis 

with shrunken covariances) 
207, 208 

data given as intervals 339, 370, 
371 

data mining 200 

definitions of PCs 1-6, 18, 30, 36, 

377, 394 
demography 9, 68-71, 108-110, 

195-198, 215-219, 245-247 
density estimation 316, 327, 368 
derivation of PCs, see algebraic 
derivations, geometric 
derivations 
descriptive use of PCs 19, 24, 49, 
55, 59, 63-77, 130, 132, 159, 
263, 299, 338, 339 
see also first few PCs 
designed experiments 336, 338, 
351-354, 365
optimal design 354 
see also analysis of variance, 
between-treatment PCs, 
multivariate analysis of 
variance, optimal design, 
PCs of residuals 
detection of outliers 13, 168, 207, 
211, 233-248, 263, 268, 352 
masking or camouflage 235 
tests and test statistics 236-241, 

245, 251, 268 
see also influential observations, 
outliers 



dimensionality reduction 1-4, 46, 
74, 78, 107, 108, 111-150, 
160 

preprocessing using PCA 167, 
199, 200, 211, 221, 223, 396,
401 

redundant variables 27 
dimensionless quantity 24 
directional data 339, 369, 370 
discrete PC coefficients 269, 
284-286 

see also rounded PC coefficients, 

simplified PC coefficients 
discrete variables 69, 88, 103, 145, 

201, 339-343, 371, 388 
categorical variables 79, 156, 

375, 376 
measures of association and 

dispersion 340 
see also binary variables, 

contingency tables, 

discriminant analysis, Gini's 

measure of dispersion, 

ranked data 
discriminant analysis 73, 111, 129, 

137, 199-210, 212, 223, 335, 

351, 354, 357, 386, 408 
assumptions 200, 201, 206 
for discrete data 201 
for functional data 327 
linear discriminant function 201, 

203 

non-linear 206 

non-parametric 201 

probability/cost of

misclassification 199, 201, 
203, 209 

quadratic discrimination 206 

training set 200, 201 

see also between-group 
variation, canonical
variate (discriminant) 
analysis, regularized 
discriminant analysis, 
SIMCA, within-groups PCs 
discriminant principal component

analysis 209 
dissection 84, 210, 212, 214, 215, 

217, 219 
distance/dissimilarity measures 
between observations 79, 86, 
89, 90, 92, 93, 106, 200, 
210-212, 215, 348 
between variables 391 
geodesic distance 382 
see also Euclidean distance, 
Mahalanobis distance, 
similarity measures 
dominant PCs 22, 40, 42, 113, 131, 
134, 135, 263, 271, 276, 389 
doubly-centred PCA 42, 344, 372, 

374, 389-391 
duality between PCA and principal 

coordinate analysis 86-90 
duality diagram 386 

ecology 9, 117, 118, 130, 131, 224, 
261, 343, 371, 389 
habitat suitability 239 
see also biological applications 
economics and finance 9, 300, 329
econometrics 188, 330, 393 
financial portfolio 404 
stock market 76, 77, 300 
eigenzeit 323 

elderly at home 68-71, 110 
ellipsoids, see concentration 
ellipses, contours of 
constant probability, 
interval estimation, 
principal axes of ellipsoids 
elliptical distributions 20, 264, 379, 
394 

El Nino-Southern Oscillation 

(ENSO) 73, 305, 306, 311 

EM algorithm 60, 222, 363, 364, 
412 

regularized EM algorithm 364 
empirical orthogonal functions 
(EOFs) 72, 74, 274, 296, 



297, 303, 320 
space-time EOFs 333 
see also cyclostationary EOFs, 

extended EOFs, Hilbert 

EOFs 

empirical orthogonal telecon- 
nections 284, 289, 290, 

390 

entropy 20, 219, 396 
equal-variance PCs 10, 27, 28, 43, 
44, 252, 410, 412 
nearly equal variances, see 

nearly equal eigenvalues 
see also hypothesis testing for 
equality of PC variances 
error covariance matrix 59, 387, 
400 

errors-in-variables models 188 
estimation, see bootstrap 
estimation, interval 
estimation, least 
squares (LS) estimation, 
maximum likelihood
estimation, method of
moments estimation, 
point estimation, robust 
estimation 
Euclidean norm 17, 37, 46, 113, 

380, 387, 392
exact linear relationships between 
variables, see zero-variance 
PCs 

extended components 404 
extended EOF analysis (EEOF 

analysis) 307, 308, 333, 398, 

399 

multivariate EEOF analysis 307, 
308 

factor analysis 7, 8, 60, 115, 116, 
122, 123, 126, 127, 130-132, 
150-166, 269, 270, 272-274, 
296, 336, 357, 364, 396, 408 
comparisons with PCA 158-161 

factor rotation, see rotation 
first few (high variance) PCs
computation 408-411, 413 
dominated by single variables 

22, 24, 41, 56 
in canonical correlation analysis 

223, 224, 362 
in climate change detection 332, 

333 

in cluster analysis 211-219 

in discriminant analysis 200-202, 

207, 208 
in factor analysis 157-162 
in independent component 

analysis 396 
in outlier detection 234-236, 

238, 239, 263, 367 
in projection pursuit 221 
in regression 171-174, 186-188, 

191 

in variable selection 138,
186-188, 191, 197 

see also cumulative percentage of 
total variation, descriptive 
use of PCs, dimensionality 
reduction, dominant PCs, 
interpretation of PCs, 
residuals after fitting first 
few PCs, rotation, rules 
for selecting PCs, size and 
shape PCs, two-dimensional 
PC plots 

fixed effects model for PCA 59-61,
86, 96, 124, 125, 131, 158, 
220, 267, 330, 376, 386 

Fourier analysis/transforms 311, 
329, 370 

frequency domain PCs 299, 310, 

328-330, 370 
multitaper frequency 

domain singular value 

decomposition (MTM-SVD) 

303, 311, 314 
functional and structural 

relationships 168, 188-190 
functional data 61, 266, 302, 



320-323, 331, 384, 387 
functional PCA (FPCA) 274, 

316-327, 384, 402 
bivariate FPCA 324
estimating functional PCs 316, 

318-320, 327 
prediction of time series 316, 

326, 327 
robust FPCA 266, 316, 327 
see also rotation 

gamma distribution

probability plots 237, 239, 245 
gas chromatography, see chemistry 
Gaussian distribution, see normal 

distribution 
generalizations of PCA 60, 189, 

210, 220, 342, 360, 361, 

373-401 

generalized linear models 61, 185 

bilinear models 61 
generalized SVD, see singular

value decomposition 
generalized variance 16, 20 
genetics 9, 336, 413 
gene shaving 213 
geology 9, 42, 346, 389, 390 

trace element concentrations 248 
geometric derivation of PCs 7, 8, 

10, 36, 59, 87, 189 
geometric properties of PCs 7, 8, 

10, 18-21, 27, 29, 33-40, 46, 

53, 78, 80, 87, 113, 189, 212, 

320, 340, 347, 372 
statistical implications 18, 33 
Gini's measure of dispersion 340 
Givens transformation 410 
goodness-of-fit tests 317, 373, 401, 

402 

lack-of-fit test 379 
graphical representation 

comparing covariance matrices 
360 

dynamic graphics 79 
of correlations between variables

and components 404 
of data 63, 78-110, 130, 200, 
201, 212, 214, 215, 217-219, 
235-236, 242, 244-247, 338,
341, 353, 367 
of intrinsically high-dimensional 
data 107-110 
group structure 80, 145, 299, 387, 
398 

see also cluster analysis, 
discriminant analysis 
growing scale PCA 334 
growth curves 328, 330, 331 

Hilbert EOF analysis see complex 
PCA 

Hilbert transforms 310, 329, 369 
history of PCs 6-9 
Hotelling's $T^2$ 205, 356, 367, 368
household formation 195-198, 
245-247 

Householder transformation 410, 
412 

how many factors? 116, 126, 130, 

131, 159, 162 
how many PCs? 43, 53, 54, 63, 78, 
111-137, 159, 222, 230, 238, 
253, 261, 271, 327, 332, 333, 
338, 385, 387, 395 
see also parallel analysis, rules 
for selecting PCs 
hypothesis testing 

for common PCs 356-360 
for cumulative proportion of 

total variance 55 
for equality of multivariate 

means 205 
for equality of PC variances 
53-55, 118, 119, 128, 131, 
132, 136, 276, 394 
for equivalence of subspaces 360 
for Hilbert EOFs 311 
for linear trend in PC variances 
120, 356 



for normality 402 

for outliers 236, 238, 239, 241, 

367, 368 
for periodicities 304, 314 
for specified PC coefficients 53, 

293, 394 
for specified PC variances 53, 

114, 394 
see also likelihood ratio test,

minimum $\chi^2$ test

ill-conditioning 390 

see also multicollinearity
image processing 56, 346, 395, 401 

eigenimages 346 
Imbrie's Q-mode method 390 
imputation of missing values 363, 
366 

independent component analysis 

(ICA) 373, 395, 396 
indicator matrices 343 
inference for PCs see estimation, 
hypothesis testing, interval 
estimation 
influence function 
additivity 251 
deleted empirical influence 
function 251
empirical influence function 251 
for PC coefficients and 
variances 253-259 
local 262 

sample influence function 249, 
250, 252 
for PC coefficients and 

variances 253-259 
for regression coefficients 249 
standardized influence matrix 
240 

theoretical influence function 
for correlation coefficients 250 
for PC coefficients and 
variances 249-251, 253, 263
for robust PCA 267
for subspaces 252-254 
influential observations 43, 81, 123,
232, 235, 239, 242, 248-259, 
265, 268 

tests of significance 253, 254, 268 
influential variables 145, 147, 260 
information see entropy 
instability see stability 
instrumental variables 144, 230, 

298, 373, 392-394, 401 
integer-valued PC coefficients, see 

discrete PC coefficients 
intelligence tests, see children 
interactions 104, 353, 390 

see also PCs of residuals 
inter-battery factor analysis 225, 

399 

interpolation see smoothing and 
interpolation 

interpretation of PCs and related 
techniques 22, 25, 40, 43, 
44, 56-58, 63-77, 84, 99, 
142, 166, 191, 217-218, 225, 
244, 245, 269-298, 333, 339, 
347, 352, 370, 391, 403, 404 
over-interpretation 298 

interpretation of two-dimensional 
plots 

biplots 91-103, 106, 107 
correspondence analysis 103-107 
PC plots 80-85, 89, 106, 107 
principal co-ordinate plots 89, 
106, 107 
interval data see data given as 

intervals 
interval estimation 

for PC coefficients and variances 
51-53, 394 
invariance 

scale invariance 26 
under orthogonal 
transformations 
21

inverse of correlation matrix 32 
irregularly-spaced data 320, 331, 
365, 385 



isometric vector 53, 344, 345, 347, 

393, 401, 404 
Isle of Wight 161, 271 

jackknife 52, 125, 126, 131, 132, 
261, 394 

Kaiser's rule 114, 115, 123, 126, 

130-132, 238 
Kalman filter 335 
Karhunen-Loeve expansion 303, 

317 

kriging 317 
kurtosis 219 

$L_1$-norm PCA 267
Lagrange multipliers 5, 6 
l'analyse des correspondances, see

correspondence analysis 
landmark data 210, 323, 345, 346, 

369 

Laplace distribution 267 

large data sets 72, 123, 221, 333,

339, 372, 408, 414 
LASSO (Least Absolute Shrinkage 
and Selection Operator) 
167, 284, 286-291 
last few (low variance) PCs 3, 27, 
32, 34, 36, 56, 94, 112, 277, 
347, 352, 374, 377, 378 
computation 409, 410 
examples 43, 44, 58, 242-248 
in canonical correlation analysis 
223 

in discriminant analysis 202, 

204, 205, 207, 208 
in outlier detection 234, 235, 

237-239, 242-248, 263, 367 
in regression 171, 174, 180-182, 

186-188, 191, 197 
in variable selection 138, 

186-188, 191, 197 
minor component analysis 413 
treated as noise 53, 118, 128 
see also hypothesis testing for
equality of PC variances, 
near-constant relationships, 
residuals after fitting first 
few PCs, zero variance PCs 
latent root regression 168, 178, 

180-182, 185-187, 190, 191,

197, 239 
latent semantic indexing 90 
latent variables 151, 165, 226, 230, 

231 

latent variable multivariate 
regression 230, 231 
least squares estimation/estimators 
32, 34, 59, 157, 167-173, 
175-179, 181, 184, 185, 189, 
208, 229, 286, 288, 294, 304, 
326, 382, 385 
see also partial least squares 
leverage points 240 

see also influential observations
likelihood ratio test 54, 55, 120, 

353, 356, 360
linear approximation asymmetric 

PCA 401 
loadings see factor loadings, PC 

coefficients 
local authorities 

British towns 71, 215 
England and Wales 195-198, 

245-247 
English counties 108-110, 
215-219 
local PCA 381 

log-eigenvalue (LEV) diagram 

115-118, 128, 134-136
log transform see transformed 

variables 
longitudinal data 328, 330, 331, 
355

lower (or upper) triangular 
matrices 182, 411, 412 

lower rank approximations to 

matrices 38, 46, 113, 120, 
342, 365, 383, 385 



LR algorithm 411 

M-estimators 264, 265, 267 
Mahalanobis distances 33, 93, 94, 

104, 203, 204, 209, 212, 237, 

264, 265 

manufacturing processes 366-368 
matrix correlation 96, 140, 141 
matrix-valued data 370
maximum covariance analysis 225, 

226, 229, 401 
maximum likelihood estimation
220, 264 
for common PCs 355 
for covariance matrices 50, 336, 
363, 364
for factor loadings 155-157
for functional and structural 

relationships 189 
for PC coefficients and variances 

8, 50, 365
in PC models 60, 222, 267, 364, 

386 

measurement errors 151, 188, 189 
medical applications 
biomedical problems 395 
clinical trials 40, 239 
epidemiology 248, 336 
ophthalmology 266
see also chemistry (blood 
chemistry) 
meteorology and climatology 8, 9, 
90, 183, 213, 381 
atmospheric pressure 71-73, 401 
cloudbase heights 211 
cloud-seeding 339 
monsoon onset date 174 
satellite meteorology 358 
wind data 369, 370 
see also atmospheric science, 
climate change/variation,
ENSO, NAO, temperatures 
method of moments estimation 
for PC coefficients and 
variances 50 
metrics 42, 59, 60, 185, 189, 210,
220, 260, 325, 331, 373, 382, 
386-388 
optimal metric 387 

minimax components 267 

minimum $\chi^2$ test 120

minimum description length 19, 
39, 395

minimum spanning tree (MST) 

81-83, 130 
minimum variance ellipsoids 267 
misclassification probabilities, see 

discriminant analysis 
missing data 60, 61, 83, 134, 339, 
363-366, 412 
estimating covariance/correlation
matrices 363-365
estimating PCs 365, 385 
in biplots 103, 104 
in designed experiments 353, 365 
in regression 363 
mixtures of distributions 61, 165, 

200, 221, 222, 241, 364 
modal dispersion matrix 395 
models for PCA 50, 54, 59-61, 119, 
124-126, 132, 151, 158-160, 
220, 364, 369, 405 
see also fixed effects model for 
PCA 

modified principal components 144 
most correlated components 26 
multichannel singular spectrum 
analysis (MSSA) 302, 305, 
307, 308, 310, 311, 316, 329 
multicollinearities 167, 168, 

170-173, 177, 180, 181, 185, 
188, 196, 286, 378 
predictive and non-predictive 
multicollinearities 180, 181, 
185, 188 
variance inflation factors (VIFs) 

173, 174 
see also ill-conditioning 



multidimensional scaling see 

scaling or ordination 

techniques 
multilevel models 353 
multiple correlation coefficient 25, 

141, 143, 174, 177, 191, 197, 

198, 403 

multiple correspondence analysis, 
see correspondence analysis 

multiple regression, see regression 
analysis 

multivariate analysis of variance 
(MANOVA) 102, 351, 353 

multivariate normal distribution 8, 
16, 18, 20, 22, 33, 39, 47-55, 
60, 69, 119, 152, 155-157, 
160, 201, 207, 220-222, 236, 
239, 244, 254, 264, 267, 276, 
299, 338, 339, 365, 367, 368, 
379, 386, 388 
curvature 395 

see also contours of constant 
probability, inference for 
PCs 

multivariate regression 17, 183, 
223, 228-230, 331, 352 

multiway PCA, see three-mode 
PCA 

near-constant relationships 

between variables 3, 13, 27, 
28, 42-44, 119, 138, 167, 
181, 182, 189, 235, 374, 377, 
378 

nearly equal eigenvalues 43, 262, 
263, 276, 277, 360, 408, 410 
see also stability 
neural networks 200, 266, 373, 

379-381, 388, 400, 401, 405, 
408, 412-414 
analogue/digital 414 
autoassociative 381 
biological plausibility 413 
first or last PCs 400, 413 
input training net 381 
PC algorithms with noise 400 
sequential or simultaneous 380 
single layer/multi-layer 413, 414 
see also computation, cross- 
correlation asymmetric 
PCA, linear approximation 
asymmetric PCA, oriented 

PCA 
nominal data, see binary variables, 
contingency tables, discrete 
variables 
non-additive part of a two-way 

model, see interactions in a 
two-way model 
non-centred PCA, see uncentred 
PCA 

non-independent data, see sample 
surveys, spatial data, time 
series 

non-linear PCA 20, 102, 343, 365, 
373-382, 388, 400, 413 
distance approach 376, 385 
Gifi approach 343, 374-377 
non-linear relationships 80, 85 
non-metric multidimensional 
scaling, see scaling or 
ordination techniques 
non-normal data/distributions 49, 

261, 373, 394-396 
normal (Gaussian) distribution 68, 
114, 131, 186, 189, 261 
probability plots 245 
see also multivariate normal 
distribution 
normalization constraints on PC 
coefficients 6, 14, 25, 30, 72, 
154, 162, 211, 237, 271, 277, 
278, 286, 291, 297, 323, 387, 
404, 408, 410 
North Atlantic Oscillation (NAO) 
73, 296 

oblique factors/rotation 152-154, 
156, 162-165, 270, 271, 295, 
383 



see also rotation 
oceanography 8, 9, 303, 370 
O-mode to T-mode analyses 398 
optimal algebraic properties, see 

algebraic properties 
ordinal principal components 341 
ordination or scaling techniques, 
see scaling or ordination 
techniques 
oriented PCA 401 
orthogonal factors/rotation 

153-155, 161-165, 166, 
270-274, 291 
see also rotation 
orthogonal projections, see 

projections onto a subspace 
orthonormal linear transformations 

10, 11, 31, 37 
oscillatory behaviour in time series 
302-316, 329 
propagating waves 309, 311, 314, 

316, 329 
standing waves 309, 311, 316 
outliers 81, 98, 101, 134, 137, 219, 
232-248, 262-265, 268, 387, 
394 

Andrews' curves 110, 242 
cells in a data matrix 385 
in quality control 240, 366-368 
with respect to correlation 

structure 233-239, 242, 244, 
245, 248 
with respect to individual 

variables 233-239, 242, 245, 

248 

see also detection of outliers, 
influential observations 



painters, see artistic qualities 
parallel analysis 117, 127-129, 131, 
262 

parallel principal axes 379 
partial correlations 127, 157 
partial least squares (PLS) 167, 
168, 178, 183-185, 208, 229 
pattern recognition 200 
patterned correlation/covariance 
matrices 27, 30, 56-58 
all correlations equal 27, 55, 56 
all correlations positive 57, 58, 
67, 84, 99, 145, 148, 162, 
174, 245 
all correlations (near) zero 54, 

55, 206, 348 

groups of correlated variables 
56-58, 114, 138, 167, 196, 
213 

widely differing variances 22, 40, 

56, 115, 134, 135 

see also structure of PCs, 
Töplitz matrices 
PC coefficients 

alternative terminology 6, 72 

arbitrariness of sign 67 

hypothesis testing 53 

interval estimation 52 

maps of coefficients 72, 73, 80, 
275, 284-283 

point estimation 50, 66 

use in variable selection 138, 
138, 141 

see also comparisons between 
PCs, computation of PCs, 
discrete PC coefficients, first 
few PCs, influence functions 
for PC coefficients, last 
few PCs, normalization 
constraints, probability 
distributions, rounded 
PC coefficients, sampling 
variation, simplified PC 
coefficients, stability of 
PCs, structure of PCs 
PC loadings see PC coefficients 
PC regression 32, 123, 167-199, 
202, 245, 337, 352 

computation 46, 173, 408, 412 

interpretation 170, 173 

locally weighted 185 



PC scores 30, 31, 36, 39, 45, 72, 

169, 238, 265, 342, 362, 413 
PC series, see point estimation 
PC variances 

hypothesis testing 53-55, 114, 
117-120, 128, 129, 136 

interval estimation 51, 52 

lower bounds 57 

point estimation 50 

tests for equality 53-55, 118, 
119, 128, 134 

see also Bartlett's test, 
computation of PCs, 
first few PCs, influence 
functions for PC variances, 
last few PCs, probability 
distributions, rules for 
selecting PCs, sampling 
variation 
PCA based on Euclidean similarity 
391 

PCA of residuals/errors 240, 304, 
352, 353, 365, 391, 394 

see also interactions 
penalty function 278 

roughness penalty 325, 326, 377 
periodically extended EOFs 

314-316 
permutational PCs 339, 340 
perturbed data 259-262 
physical interpretation 

in ICA 396 

of PCs 132, 270, 296-298, 320 
modes of the atmosphere 132, 
296, 297, 391 
Pisarenko's method 303 
pitprops 190-194, 286, 287, 289 
point estimation 

for factor loadings 151-157 

for factor scores 153, 160 

for PC coefficients 50, 66 

for PC series 329 

for PC variances 50 

in econometrics 393, 394 
in functional and structural 

relationships 189 
in functional PCA 318-320 
in regression 167-173, 175-186, 

304, 337 
see also bootstrap estimation, 

least squares estimation, 

maximum likelihood 

estimation, method of 

moments estimation, robust 

estimation 

power method 8, 409-411 
accelerated version 8, 410 
convergence 410 
with inverse iteration 410 
with shifts 410, 411 
prediction sum of squares (PRESS) 

121-124, 145, 175 
predictive oscillation patterns 

(PROPs) 309 
predictor variables 227-230 

see also regression analysis 
pre-processing data using PCA see 

dimensionality reduction 
principal axes 

for contingency table data 342 
of ellipsoids 18, 22, 27, 39 
principal co-ordinate analysis 39, 
79, 85-90, 93, 106, 107, 209, 
339, 346, 382 
principal curves 373, 377-379, 381 
principal differential analysis 316, 
326 

principal factor analysis 159, 160 
see also point estimation for 

factor loadings 
principal Hessian directions 185 
principal oscillation pattern (POP) 

analysis 302, 303, 307-311, 

314-316, 335 
principal planes 279 
principal points 379 
principal predictors 227, 228, 354 
principal response curves 331, 393 



principal sensitivity components 
240 

principal sequence pattern analysis 
308 

principal variables 139-141, 144, 
146-149, 368, 394, 395 
see also selection of variables 

probabilistic PCA, see models for 
PCA 

probability distributions 59 
asymptotic distributions 9, 

47-49, 51, 53 
empirical distributions 49, 128, 

129 

exact distributions 48 
for noise 388 

for PC coefficients and variances 
8, 9, 29, 47-49, 51, 53, 128, 
129 

see also models for PCA 
process control, see statistical 

process control 
Procrustes rotation 143, 145, 221, 

260, 362 
projection pursuit 79, 200, 219-221, 

241, 266, 387, 396 
projections 

in a biplot 94, 95 

onto a subspace 20, 21, 34-37, 

61, 140-141, 393, 399 
onto rotated axes 154 
properties of PCs, see algebraic 
properties, geometric 
properties 
psychology/psychometrics 7, 9, 
117, 130, 133, 225, 296, 343, 
398 

QL algorithm 411-413 
convergence 412 
incorporating shifts 412 

QR algorithm 411-413 

quality control, see statistical 
process control 

quantitative structure-activity 
relationships (QSAR), see 
chemistry 
quartimin/quartimax rotation 153, 
154, 162-165, 270, 271, 277, 
278 

quaternion valued data 370 

ranked data 267, 338, 340, 341, 

348, 349, 388 
rank correlation 341 
reciprocal averaging, see scaling or 

ordination 
red noise 301, 304, 307, 314 
reduced rank regression 229, 230, 

331, 353, 392, 401 
softly shrunk reduced rank 

regression 230 
reduction of dimensionality, see 

dimensionality reduction 
redundancy analysis 225-230, 331, 

393, 401 

redundancy coefficient/index 226, 
227 

redundant variables, see 

dimensionality reduction 
regionalization studies 213, 294 
regression analysis 13, 32, 33, 74, 
111, 112, 121, 127, 129, 137, 

144, 145, 157, 167-199, 202, 
205, 223, 227, 239, 240, 284, 
286, 288, 290, 294, 304, 326, 
337, 352, 363, 366, 368, 378, 
390, 399, 412 

computation 46, 168, 170, 173, 

182, 412 
influence function 249, 250 
interpretation 46, 168, 170, 173, 

182, 412 
residuals 127, 399 
variable selection 111, 112, 137, 

145, 167, 172, 182, 185-188, 
190, 191, 194, 197, 198, 286 

see also biased regression 
methods, econometrics, 
influence functions, 



latent root regression, 
least squares estimation, 
multivariate regression, PC 
regression, point estimation, 
reduced rank regression, 
ridge regression, robust 
regression, selection of 
variables 

regression components 403 

regression tree 185 

regularized discriminant analysis 
205, 207, 208 

reification 269 

repeatability of PCA 261, 394 
repeated measures, see longitudinal 
data 

rescaled PCs 403, 404 

residual variation 16, 17, 108, 114, 

129, 220, 240, 290, 399 
see also error covariance matrix, 

PCA of residuals 
residuals in a contingency table, 

see interactions 
response variables 227-230 

PCs of predicted responses 228, 

230 

see also regression analysis 
restricted PC regression 184 
ridge regression 167, 178, 179, 181, 

185, 190, 364 
road running, see athletics 
robust estimation 232, 262-268 
in functional PCA 266, 316, 327 
in non-linear PCA 376 
in regression 264, 366 
of biplots 102, 265 
of covariance/correlation 

matrices 264, 265-267, 363, 
364, 394 
of distributions of PCs 267 
of means 241, 264, 265 
of PCs 50, 61, 233, 235, 263-268, 

356, 366, 368, 394, 401 
of scale 266 

see also M-estimators, minimum 
variance ellipsoids, 

S-estimators 
rotation 

of factors 116, 153-156, 159, 

162-165 
of functional PCs 316, 327 
of PCs 43, 74, 151, 154, 162, 

163, 165, 166, 182, 185, 188, 

191, 213, 238, 248, 269-279, 

291, 295, 297, 298, 370, 396, 

407 

of subsets of PCs with similar 

variances 276, 277 
rotation/switching of PCs 259 
to simple components 285, 291 
to simple structure 153, 154, 

182, 185, 270, 271, 276, 277, 

369 

see also oblique factors, 
orthogonal factors, 
quartimin rotation, varimax 
rotation 

rounded PC coefficients 40, 42-44, 
67, 259, 263, 292, 293 
see also discrete PC coefficients, 
simplified PC coefficients 
row-centered data 89 
rules for selecting PCs 54, 111-137, 
159, 162, 217 
ad hoc rules 112-118, 130-135, 

136, 138, 147, 149, 238 
based on cross-validation 112, 

120-127, 130-132, 135-137 
based on cumulative variances of 
PCs 112-114, 117, 130-136, 
138, 147, 147 
based on gaps between 

eigenvalues 126, 127, 129, 
133 

based on individual variances of 
PCs 114-118, 123, 130-136, 
138, 147, 149, 238 
based on partial correlation 127 
from atmospheric science 112, 
116, 118, 127-130, 132-136 



statistically based rules 112, 
118-137 

see also broken stick model, 
equal variance PCs, how 
many PCs, Kaiser's rule, 
log eigenvalue diagram, 
parallel analysis, scree 
graph, selection of a subset 
of PCs 

RV-coefficient 38, 143-145, 147, 
252 

S-estimators 267 
S-mode analysis 308, 398 
sample sizes 

effective/equivalent 129, 148, 

299 

large see large data sets 
moderate 249, 252 
small 65, 68, 148, 235, 257 
smaller than number of variables 
90, 148, 207, 413 
sample surveys 49, 328, 335, 336, 
353, 353 
stratified survey design 336, 353 
sampling variation 
PC coefficients 65 
PC variances 115, 123 
scale dependence of covariance 
matrices 24, 26 
see also invariance (scale 
invariance) 
scaling or ordination techniques 
85-90, 102, 106, 107, 200 
classical scaling 85 
dual scaling 103, 343 
non-metric multidimensional 

scaling 86, 372 
reciprocal averaging 103, 343 
see also principal co-ordinate 
analysis 
scatter, definitions of 395 
scores for PCs, see PC scores 
SCoT (simplified component 

technique) 278-279, 287, 
289, 290, 291 
SCoTLASS (simplified component 

technique - LASSO) 

280-283, 287-291 
scree graph 115-118, 125, 126, 

130-132, 134, 135 
selection of subsets of PCs 
in discriminant analysis 201, 

202, 204-206 
in latent root regression 180, 181 
in PC regression 168, 170-177, 

196-198, 202, 205, 245 
see also how many PCs, rules 

for selecting PCs 
selection of variables 

in non-regression contexts 13, 

27, 38, 111, 137-149, 186, 

188, 191, 198, 220, 221, 260, 

270, 286, 288, 290, 293-295, 

376 

stepwise selection/backward 

elimination algorithms 142, 

144, 145, 147 
see also principal variables, 

regression analysis (variable 

selection) 
self-consistency 20, 378, 379 
sensible PCA 60 
sensitivity matrix 240 
sensitivity of PCs 232, 252, 

259-263, 278 
shape and size PCs, see size and 

shape PCs 
Shapiro-Wilk test 402 
shrinkage methods 167, 178-181, 

264, 288 
signal detection 130, 304, 332 
signal processing 303, 317, 395 
signal to noise ratio 337, 388, 401 
SIMCA 207-208, 239 
similarity measures 

between configurations 38 
between observations 79, 89, 

106, 210-212, 339, 390 
between variables 89, 213, 391 



see also distance/dissimilarity 

measures 
simple components 280-287, 291 
simplicity/simplification 269-271, 

274, 277-286, 403, 405 
simplified PC coefficients 66, 67, 

76, 77 

see also approximations to PCs, 
discrete PC coefficients, 
rounded PC coefficients 
simultaneous components 361 
singular spectrum analysis (SSA) 

302-308, 310, 316 
singular value decomposition 

(SVD) 7, 29, 44-46, 52, 59, 
101, 104, 108, 113, 120, 121, 
129, 172, 173, 226, 229, 230, 
253, 260, 266, 273, 353, 365, 
366, 382, 383 

comparison of SVDs 362 

computation based on SVD 46, 
173, 412, 413 

generalized SVD 46, 342, 383, 
385, 386 

multitaper frequency domain 
SVD (MTM-SVD) 302, 311, 
314, 316 

size and shape PCs 53, 57, 64, 67, 
68, 81, 104, 297, 298, 338, 
343-346, 355, 356, 388, 393, 
401 

see also contrasts between 
variables, interpretation 
of PCs, patterned 
correlation/covariance 
matrices 
skewness 219, 372 
smoothing and interpolation 274, 
316, 318, 320, 322, 324-326, 
334, 335, 377-379 

of spatial data 334, 335, 364, 365 

lo(w)ess 326 

splines 320, 322, 331, 377, 378, 
387 

sparse data 331 
spatial correlation/covariance 297, 
302, 317, 333-335 
intrinsic correlation model 334 
isotropy and anisotropy 297, 334 
linear model of co-regionalization 
334 

non-stationarity 297 
spatial data 71-74, 130, 274, 275, 
278-283, 289, 294, 295, 300, 
302, 307-317, 320, 328, 329, 
332-339, 364, 365, 370, 385, 
398 

spatial lattice 368 
spatial domain, size and shape 
297, 334 

species abundance data 105-107, 
224-225, 339, 371, 372, 
389-391 

between- and within-site species 
diversity 372, 389 

spectral decomposition of a matrix 
13, 14, 31, 37, 44, 46, 86, 
87, 101, 113, 170, 171, 266, 
333, 344, 355, 368, 395, 404 
weighted 207 

spectral/spectrum analysis of a 
time series 300, 301, 311, 
337 

spectrophotometry, see chemistry 

sphering data 219 

splines see smoothing and 

interpolation 
stability/instability 
of PC subspaces 42, 53, 259, 261 
of PCs and their variances 76, 
81, 118, 126, 127, 232, 
259-263, 267, 297 
of spatial fields 130 
see also influence function, 
influential variables 
standard errors for PC coefficients 

and variances 50, 52 
standardized variables 21, 24-27, 
42, 112, 169, 211, 250, 274, 
388, 389 



statistical physics 266, 401 
statistical process control 114, 184, 
240, 333, 337, 339, 366-369, 
381, 398 
CUSUM charts 367 
exponentially-weighted moving 
principal components 337, 
368 

squared prediction error (SPE) 
367, 368 

stochastic complexity 19, 39, 395 
strategies for selecting PCs in 

regression 
see selection of subsets of PCs 
structural relationships, see 

functional and structural 

relationships 
structure of PCs 24, 27, 28, 30, 

56-59 

PCs similar to original variables 
22, 24, 40, 41, 43, 56, 115, 
127, 134, 135, 146, 149, 159, 
211, 259 
see also contrasts between 
variables, interpretation 
of PCs, patterned 
correlation/covariance 
matrices, PC coefficients, 
size and shape PCs 
student anatomical measurements, 
see anatomical 
measurements 
Sturm sequences 411 
subjective PCs 404 
subset selection, see selection of 
subsets of PCs, selection of 
variables 
subspaces 

spanned by subsets of PCs 43, 
53, 140, 141, 144, 229, 230, 
259, 261, 276, 357-361 
spanned by subsets of variables 

140, 141, 144 
see also comparisons between 
subspaces 
supervised/unsupervised learning 
200 

SVD analysis, see maximum 

covariance analysis 
SVD see singular value 

decomposition 
sweep-out components 403 
switching of components 259 

t-distribution/t-tests 186, 187, 191, 
193, 196, 197, 204, 205 
multivariate t-distribution 264, 
364 

T-mode analysis 308, 398 
temperatures 22, 274, 316, 332 
air temperatures 71, 211, 302, 

303, 329 
sea-surface temperatures 73, 
211, 274, 275, 278-283, 286, 
289, 310-314, 364, 396 
tensor-based PCA 398 
three-mode factor analysis 397 
three-mode PCA 368, 397, 398 
time series 49, 56, 72, 74, 76, 128, 
129, 148, 274, 290, 298-337, 
360, 365, 369, 370, 384, 393, 
397, 398, 401 
co-integration 330 
distributed lag model 337 
moving averages 303, 368 
seasonal dependence 300, 303, 

314, 315 
stationarity 300, 303, 304, 314, 

316, 327, 330 
tests for randomness (white 

noise) 128 
see also autocorrelation, 

autoregressive processes, 
frequency domain PCs, red 
noise, spectral analysis, 
trend, white noise 
Töplitz matrices 56, 303, 304 
transformed variables 64, 248, 374, 
376, 377, 382, 386 
logarithmic transformation 24, 



248, 344, 345, 347-349, 372, 
388, 390 

trend 148, 326, 336 

removal of trend 76, 393 

tri-diagonal matrices 410 

truncation of PC coefficients 67, 
293-296 

two-dimensional PC plots 2-4, 
78-85, 130, 201-203, 212, 
214-219, 234-236, 242-247, 
258, 299 
see also biplots, correspondence 
analysis, interpretation 
of two-dimensional plots, 
principal co-ordinate 
analysis, projection pursuit 

two-stage PCA 209, 223 

uncentred 'covariances' 290, 390 
uncentred PCA 41, 42, 349, 372, 
389, 391 

units of measurement 22, 24, 65, 
74, 211, 274, 374, 388, 391 

upper triangular matrices, see 
lower triangular matrices 

variable selection, see selection of 

variables 
variance inflation factors (VIFs), 

see multicollinearities 
variances for PCs, see PC 

variances 
variation between means 60, 85, 

96, 158 
varimax rotation 153, 154, 

162-165, 182, 188, 191, 238, 

270, 271, 274, 277-278 
vector-valued data 129, 369, 370 

weighted PCA 21, 209, 241, 330, 

353, 382-385 
weights 

exponentially decreasing 337, 
368, 384 
for covariance matrices 264, 265, 

337, 384 
for observations 103, 260-262, 

264-266, 268, 373, 383-386, 

390 
for PCs 354 

for variables 21, 383-385 
in fixed effects model 60, 96, 

124, 220, 267, 330, 386 
in singular value decomposition 
230, 266, 383, 384 
well separated eigenvalues, see 
nearly equal eigenvalues 
white noise 128, 301, 304 
multivariate white noise 302 



Winsorization 266 
Wishart distribution 47 
within-group PCs 201-209, 

212-214, 352 
within-group variation 201-209, 

212, 220, 351, 399 
within-treatment (or block) PCs, 
see PCs of residuals 

Yanai's generalized coefficient of 
determination (GCD) 
140, 141, 144, 252 

zeros in data 348, 349, 372 
zero-variance PCs 10, 27, 42, 43, 
345, 347, 359, 390 